fix: include empty upstream replies in fallback retry and surface the empty-reply error to the user #6454
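Summary of Changes (Gemini Code Assist)

This pull request improves how the system handles empty replies from large language models (LLMs). It introduces a dedicated helper function to identify empty replies and integrates it into the existing LLM fallback mechanism, so that when an upstream LLM provider returns a meaningless response, the system tries the other providers. In addition, once all fallback attempts have failed, the system raises an explicit error instead of failing silently, which gives users better visibility into the result of the model run and a clearer basis for acting on it.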
Hey - I've left some high level feedback:
- Consider making `_is_empty_llm_response` more reusable by returning early when `resp.result_chain is None` or has an empty `chain`, which will simplify the loop and avoid iterating when there's clearly no content.
- Instead of only embedding `model_id`, `provider_id`, and `run_id` in the `LLMEmptyResponseError` message string, consider adding them as explicit attributes on the exception so callers can programmatically inspect the context.
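The second point can be illustrated with a minimal sketch. It assumes the exception derives directly from `Exception` and takes keyword-only context arguments; the real class in `astrbot/core/exceptions.py` may have a different base class or signature, and the values used below are placeholders.

```python
class LLMEmptyResponseError(Exception):
    """Sketch only: the real class in astrbot/core/exceptions.py may differ."""

    def __init__(self, message, *, model_id=None, provider_id=None, run_id=None):
        super().__init__(message)
        # Context is stored as attributes so callers do not have to parse the message.
        self.model_id = model_id
        self.provider_id = provider_id
        self.run_id = run_id


try:
    raise LLMEmptyResponseError(
        "LLM returned empty assistant message with no tool calls.",
        model_id="gpt-4",       # placeholder values
        provider_id="openai",
        run_id="abc123",
    )
except LLMEmptyResponseError as err:
    caught = err
    # A caller can branch on structured fields instead of matching substrings.
    print(err.provider_id, err.run_id)
```

The message string stays unchanged, so existing log output and user-facing text are unaffected by the extra attributes.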
Related Documentation
1 document(s) may need updating based on files changed in this PR: AstrBotTeam's Space, "Changes in pr4697"
@@ -1385,7 +1385,106 @@
---
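-### 14. Other Optimizations
+### 14. Empty Response Detection and Fallback Mechanism (PR #6454)
+
+#### Overview
+[PR #6454](https://github.com/AstrBotDevs/AstrBot/pull/6454) strengthens empty-response detection and fallback handling in the built-in Agent Runner. It prevents silent failures when the LLM returns an empty response and ensures that users promptly receive a clear error message.
+
+#### Background
+Before the fix, when the LLM returned an empty response (no text content, no tool calls, no reasoning content), the system only logged the warning `"LLM returned empty assistant message with no tool calls."`. It neither triggered the fallback retry mechanism nor showed the user a meaningful error. As a result:
+- Users waiting for a response received a blank result with no indication of what had happened
+- Even with multiple fallback LLM providers configured, the system never tried a backup provider
+- Problems caused by empty responses were hard to diagnose because no context was recorded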
+
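+#### Core Improvements
+
+##### 1. Empty-response detection helper (`_is_empty_llm_response`)
+A new `_is_empty_llm_response()` helper detects whether an LLM response is empty:
+
+**Detection conditions (all must hold):**
+- `completion_text` is empty or contains only whitespace
+- `reasoning_content` is empty or contains only whitespace
+- `tools_call_args` is empty (no tool calls)
+- `result_chain` has no meaningful content (no non-empty Plain component and no component of another type, such as an image or voice clip)
+
+**Implementation details:**
+- Inspects the components of `result_chain` one by one, skipping empty Plain components
+- Non-Plain components (such as Image or Voice) count as valid content
+- Returns `True` when the response is empty and `False` when it is valid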
+
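+##### 2. Integration with the fallback retry mechanism
+Empty-response detection is integrated into the `_iter_llm_responses_with_fallback()` method:
+
+**Trigger conditions:**
+- The LLM response role is `assistant` or `tool`
+- `_is_empty_llm_response()` returns `True`
+- The current provider is not the last fallback candidate
+
+**Behavior:**
+- Logs the warning `"Chat Model {candidate_id} returns empty response, trying fallback to next provider."`
+- Stops processing the current provider's response
+- Automatically switches to the next fallback provider
+- Individual chunks of a streaming response do not trigger detection (the check waits for the complete response)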
+
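+##### 3. New exception type (`LLMEmptyResponseError`)
+A new `LLMEmptyResponseError` exception type is added in `astrbot/core/exceptions.py`:
+
+**Raised when:**
+- Every fallback provider returned an empty response
+- The system could not obtain a valid response through the fallback mechanism
+
+**The exception message includes:**
+- `model_id`: ID of the model currently in use
+- `provider_id`: ID of the provider currently in use
+- `run_id`: unique identifier of the current run
+- Base error message: `"LLM returned empty assistant message with no tool calls."`
+
+**Example message format:**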
+```
+LLM returned empty assistant message with no tool calls. Context: model_id=gpt-4, provider_id=openai, run_id=abc123.
+```
+
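+##### 4. User-visible error reporting
+Once every fallback provider has failed, the system raises `LLMEmptyResponseError` and the user receives a clear error message:
+
+**Error handling flow:**
+1. The system tries every configured fallback provider
+2. If all providers return an empty response, it raises `LLMEmptyResponseError`
+3. The error message carries context (model_id, provider_id, run_id) to aid troubleshooting
+4. The user gets an explicit error instead of a silent failure or a pointless wait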
+
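+#### Technical Implementation Notes
+
+##### Streaming response handling
+- A single chunk of a streaming response does not immediately trigger detection (`if resp.is_chunk: ...`)
+- Detection runs only after the complete response has been generated (`resp.is_chunk=False`)
+- This avoids false positives from chunks that carry only metadata or heartbeat messages
+
+##### Fallback logic
+- Empty-response detection triggers a fallback only when the current provider is not the last candidate
+- This ensures the system tries every available fallback option
+- When the last provider returns an empty response, `LLMEmptyResponseError` is raised instead of retrying further
+
+##### Observability
+- A new warning log records which provider triggered the fallback
+- The exception message includes the full context, which helps developers and users troubleshoot
+- Logs record the specific situation in which the LLM response was empty, which helps when tuning model configuration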
+
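+#### User Benefits
+- **No silent failures**: an empty response no longer fails quietly; the user receives an explicit error message
+- **Automatic fallback**: with multiple providers configured, the system tries backup providers automatically, improving availability
+- **Clear error messages**: errors that include context help users locate the problem quickly (model configuration, provider status, and so on)
+- **Improved reliability**: prevents the pointless waiting and degraded experience that empty responses used to cause
+
+#### Scope
+This fix applies to the following scenarios:
+- An LLM provider returns an empty response (no text, no tool calls, no reasoning content)
+- Environments with multiple fallback providers configured
+- Use of the built-in Agent Runner (ToolLoopAgentRunner)
+
+---
+
+### 15. Other Optimizations
 - Enhanced JWT handling and error handling for better security and stability
 - UI detail polish for a better user experience
 - Enhanced logging and exception handling for easier issue tracking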
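The fallback behaviour described in the suggested documentation above can be sketched roughly as follows. This is not the code from the PR: `FakeResponse`, `stream_from`, `is_empty`, and `first_non_empty` are hypothetical stand-ins, the emptiness check is reduced to text only, and the sketch skips chunks where a real runner would presumably forward them to the caller.

```python
import asyncio
import logging
from dataclasses import dataclass

logger = logging.getLogger("fallback_sketch")


class LLMEmptyResponseError(Exception):
    """Stand-in for the exception the PR adds in astrbot/core/exceptions.py."""


@dataclass
class FakeResponse:
    # Hypothetical, reduced response type with only the fields this sketch needs.
    completion_text: str = ""
    is_chunk: bool = False


def is_empty(resp: FakeResponse) -> bool:
    # Reduced stand-in for _is_empty_llm_response: looks at text only.
    return not resp.completion_text.strip()


async def stream_from(parts):
    # Hypothetical provider: yields chunks, then one final complete response.
    for part in parts:
        yield FakeResponse(part, is_chunk=True)
    yield FakeResponse("".join(parts), is_chunk=False)


async def first_non_empty(candidates):
    """Return the first complete, non-empty response, falling back in order."""
    for idx, (candidate_id, parts) in enumerate(candidates):
        is_last = idx == len(candidates) - 1
        async for resp in stream_from(parts):
            if resp.is_chunk:
                continue  # chunks are never judged on their own
            if not is_empty(resp):
                return candidate_id, resp
            if not is_last:
                logger.warning(
                    "Chat Model %s returns empty response, "
                    "trying fallback to next provider.",
                    candidate_id,
                )
                break  # abandon this provider and move on to the next one
            # Last candidate was empty too: fail loudly instead of silently.
            raise LLMEmptyResponseError(
                "LLM returned empty assistant message with no tool calls."
            )
    # Message below is this sketch's own, not taken from the PR.
    raise LLMEmptyResponseError("no providers configured")


winner, resp = asyncio.run(
    first_non_empty([("primary", ["", " "]), ("backup", ["Hel", "lo"])])
)
print(winner, resp.completion_text)
```

Here the hypothetical `primary` provider produces only whitespace, so the loop logs the warning and moves on to `backup`; if `backup` were empty as well, the exception would be raised.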
Code Review
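The goal of this PR is to handle empty replies from the upstream LLM, improving the user experience by adding fallback retries and raising an explicit error. The overall approach is clear and the changes are reasonable.
The main changes are:
- In `tool_loop_agent_runner.py`, a new `_is_empty_llm_response` helper determines whether a reply is empty, and `_iter_llm_responses_with_fallback` uses it to trigger fallback.
- When the reply is still empty after all fallback attempts, an `LLMEmptyResponseError` is raised and an error carrying context information is passed to the user.
- `exceptions.py` defines the new `LLMEmptyResponseError` exception.
The code quality is good, but two places could be refactored to make the code more concise and maintainable; see my comments for details.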
`tool_loop_agent_runner.py`

    has_result_chain_content = False
    if resp.result_chain and resp.result_chain.chain:
        for comp in resp.result_chain.chain:
            # Skip empty Plain components
            if isinstance(comp, Comp.Plain):
                if comp.text and comp.text.strip():
                    has_result_chain_content = True
                    break
            else:
                # Non-Plain components (e.g., images, voice) are considered valid content
                has_result_chain_content = True
                break
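To exercise this check outside the runner, the sketch below rebuilds it with stand-in types. `Plain`, `Image`, `MessageChain`, and `LLMResponse` here are simplified placeholders rather than AstrBot's real classes, and `is_empty_llm_response` is an assumed shape for the helper based on the four conditions listed in the PR (text, reasoning content, tool calls, result chain).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Plain:
    # Stand-in for Comp.Plain
    text: str = ""


@dataclass
class Image:
    # Stand-in for any non-Plain component
    url: str = ""


@dataclass
class MessageChain:
    chain: list = field(default_factory=list)


@dataclass
class LLMResponse:
    completion_text: str = ""
    reasoning_content: str = ""
    tools_call_args: list = field(default_factory=list)
    result_chain: Optional[MessageChain] = None


def has_chain_content(resp: LLMResponse) -> bool:
    # Early return when there is clearly nothing to inspect.
    if resp.result_chain is None or not resp.result_chain.chain:
        return False
    for comp in resp.result_chain.chain:
        if isinstance(comp, Plain):
            if comp.text and comp.text.strip():
                return True  # non-empty text
        else:
            return True  # images, voice, etc. count as content
    return False


def is_empty_llm_response(resp: LLMResponse) -> bool:
    # A response is empty only if every source of content is empty.
    if (resp.completion_text or "").strip():
        return False
    if (resp.reasoning_content or "").strip():
        return False
    if resp.tools_call_args:
        return False
    return not has_chain_content(resp)


whitespace_only = LLMResponse(result_chain=MessageChain([Plain("   ")]))
with_image = LLMResponse(result_chain=MessageChain([Plain(""), Image("x.png")]))
print(is_empty_llm_response(whitespace_only), is_empty_llm_response(with_image))
```

A chain holding only whitespace text is treated as empty, while a chain with an empty `Plain` followed by an image is not.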
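The loop that checks whether `result_chain` contains valid content can be simplified with `any()` and a generator expression. The code becomes more concise and more idiomatic Python.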

    has_result_chain_content = False
    if resp.result_chain and resp.result_chain.chain:
        has_result_chain_content = any(
            (isinstance(comp, Comp.Plain) and comp.text and comp.text.strip())
            or not isinstance(comp, Comp.Plain)
            for comp in resp.result_chain.chain
        )

`tool_loop_agent_runner.py`

    base_msg = "LLM returned empty assistant message with no tool calls."
    model_id = getattr(self.run_context, "model_id", None)
    provider_id = getattr(self.run_context, "provider_id", None)
    run_id = getattr(self.run_context, "run_id", None)

    ctx_parts = []
    if model_id is not None:
        ctx_parts.append(f"model_id={model_id}")
    if provider_id is not None:
        ctx_parts.append(f"provider_id={provider_id}")
    if run_id is not None:
        ctx_parts.append(f"run_id={run_id}")

    if ctx_parts:
        base_msg = f"{base_msg} Context: " + ", ".join(ctx_parts) + "."

    raise LLMEmptyResponseError(base_msg)
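The code that builds the context part of the error message can be simplified with a dictionary and a list comprehension. This makes the code more concise and easier to extend when more context attributes are needed later.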

    base_msg = "LLM returned empty assistant message with no tool calls."
    ctx_info = {
        "model_id": getattr(self.run_context, "model_id", None),
        "provider_id": getattr(self.run_context, "provider_id", None),
        "run_id": getattr(self.run_context, "run_id", None),
    }
    ctx_parts = [f"{k}={v}" for k, v in ctx_info.items() if v is not None]
    if ctx_parts:
        base_msg = f"{base_msg} Context: {', '.join(ctx_parts)}."
    raise LLMEmptyResponseError(base_msg)
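
As a quick check of what this message construction produces, the same logic can be pulled into a standalone function. `build_empty_response_message` is a hypothetical name introduced for this sketch, and `SimpleNamespace` objects stand in for the runner's `run_context`.

```python
from types import SimpleNamespace


def build_empty_response_message(run_context) -> str:
    """Hypothetical helper mirroring the dict-based message construction."""
    base_msg = "LLM returned empty assistant message with no tool calls."
    ctx_info = {
        name: getattr(run_context, name, None)
        for name in ("model_id", "provider_id", "run_id")
    }
    # Only attributes that are actually set appear in the message.
    ctx_parts = [f"{k}={v}" for k, v in ctx_info.items() if v is not None]
    if ctx_parts:
        base_msg = f"{base_msg} Context: {', '.join(ctx_parts)}."
    return base_msg


full = SimpleNamespace(model_id="gpt-4", provider_id="openai", run_id="abc123")
partial = SimpleNamespace(provider_id="openai")
bare = SimpleNamespace()

for ctx in (full, partial, bare):
    print(build_empty_response_message(ctx))
```

Missing attributes are simply left out, so a context with none of the three fields yields the base message without a `Context:` suffix.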
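fix: include empty upstream replies in fallback retry and surface the empty-reply error to the user
A split-out commit from PR (#5610)
Modifications
Modified file: `astrbot\core\agent\runners\tool_loop_agent_runner.py`
- Added the `_is_empty_llm_response` helper function for detecting empty replies.
- In the `_iter_llm_responses_with_fallback` method, the `async for resp in self._iter_llm_responses(include_model=idx == 0)` loop now uses `_is_empty_llm_response` to decide whether a reply is valid, and triggers the fallback mechanism when it is not.
- Added a custom error for the `LLM returned empty assistant message with no tool calls.` warning, so that the failure is no longer silent: users learn the result of the model run promptly and avoid waiting pointlessly.

Modified file: `astrbot\core\exceptions.py`
- Added the custom `LLMEmptyResponseError` error to represent an empty reply.

Screenshots or Test Results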
Checklist
- 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
- 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
- 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.
- 😮 My changes do not introduce malicious code.
Summary by Sourcery
Handle empty LLM responses by triggering provider fallback and surfacing explicit errors to users when all fallbacks are exhausted.
Bug Fixes:
Enhancements: